These 183,000 Books Are Fueling the Biggest Fight in Publishing and Tech
Editor's note: This searchable database is part of The Atlantic's series on Books3. You can read about the origins of the database here, and an analysis of what's in it here.

This summer, I acquired a data set of more than 191,000 books that were used without permission to train generative-AI systems by Meta, Bloomberg, and others. I wrote in The Atlantic about how the data set, known as "Books3," was based on a collection of pirated ebooks, most of them published in the past 20 years. Since my article appeared, I've heard from several authors wanting to know if their work is in Books3.